ABSTRACT



Rate shaping is provided in per-flow queued routing
mechanisms for statistical bit rate service.

RATE SHAPING IN PER-FLOW OUTPUT
QUEUED ROUTING MECHANISMS FOR
STATISTICAL BIT RATE SERVICE

Pursuant to 35 U.S.C. 120, the priority of the inventors'
earlier filed provisional application, filed on Jun. 27, 1996
under Ser. No. 60/020,645, is claimed.

I. FIELD OF THE INVENTION 

This invention relates to packet switched communication 
networks and, more particularly, to traffic shaping for caus- 
ing the time multiplexed packet flows at queuing points 
within such networks or network elements to conform to 
specified traffic descriptors. 

II. CROSS REFERENCES TO RELATED 
APPLICATION 

For other concurrent filings on traffic shaping see Attor- 
ney Docket No. D/96271P by Christopher J. Kappler et al., 
entitled "Multiple Rate Sensitive Priority Queues for Reduc- 
ing Relative Data Transport Unit Delay Variations in Time 
Multiplexed Outputs from Output Queued Routing 
Mechanisms," Attorney Docket No. D/96271Q2P by Joseph 
B. Lyles et al., entitled "Rate Shaping in Per-Flow Queued 
Routing Mechanisms for Available Bit Rate Service," Attor- 
ney Docket No. D/96271Q3P by Joseph B. Lyles, entitled 
"Rate Shaping in Per-Flow Output Queued Routing Mecha- 
nisms for Available Bit Rate (ABR) Service in Networks 
Having Segmented ABR Control Loops," Attorney Docket
No. D/96271Q4P by Joseph B. Lyles et al., entitled "Rate
Shaping in Per-Flow Output Queued Routing Mechanisms
for Unspecified Bit Rate Service," and Attorney Docket No.
D/96271Q5P by Christopher J. Kappler et al., entitled "Rate
Shaping in Per-Flow Output Queued Routing Mechanisms
Having Output Links Servicing Multiple Physical Layers."

III. BACKGROUND OF THE INVENTION 
A. Traffic Contracts/Definitions 

Most applications that are currently running on packet 
switched communication networks can function acceptably 
with whatever bandwidth they happen to obtain from the 
network because they have "elastic" bandwidth
requirements. The service class that supports these
applications is known as "best efforts" service in the
Internet community and as "Available Bit Rate" (ABR) in the
Broadband ISDN/ATM community.

There is, however, a growing demand for network services
that provide bounded jitter or, in other words, bounded
packet delay variation (commonly referred to as cell delay
variation in an ATM context). For example, this type of
service is required for real time applications, such as circuit
emulation and video. It is not clear whether and how the
Internet community will respond to this demand, but the
Broadband ISDN/ATM community has responded by
introducing the notion of a user-network negotiated traffic
contract.

As is known, a user-network ATM contract is defined by 
a traffic descriptor which includes traffic parameters, toler- 
ances and quality of service requirements. A conformance 
definition is specified for each of the relevant traffic param- 
eters. Accordingly, ATM services may make use of these 
traffic parameters and their corresponding conformance 
specifications to support different combinations of Quality 
of Service (QoS) objectives and multiplexing schemes. 

Partially overlapping sets of ATM traffic classes have
been defined by the Telecommunications Standardization
Sector of the International Telecommunications Union
(ITU-T) and the ATM Forum. In some instances, traffic
classes which have essentially identical attributes have been
given different names by these two groups, so the following
name translation table identifies the existing equivalent
counterparts:

ITU-T Traffic Class              ATM Forum Traffic Class
-------------------------------  ------------------------------------
ABR                              ABR
Deterministic Bit Rate (DBR)     Constant Bit Rate (CBR)
Statistical Bit Rate (SBR)       Variable Bit Rate (VBR)
(No existing counterpart, but    Real time Variable Bit Rate (rt-VBR)
under study)
(No existing counterpart, but    Unspecified Bit Rate (UBR)
under study)


An ATM service contract for a virtual circuit (VC)
connection or a virtual path (VP) connection may include
multiple parameters describing the service rate of the
connection. These include the Peak Cell Rate (PCR), the
Sustainable Cell Rate (SCR), the Intrinsic Burst Tolerance
(IBT), and the Minimum Cell Rate (MCR). Not all of these
parameters are relevant for every connection or every
service class, but when they are implied or explicitly
specified elements of the service contract, they must be
respected. VC connections are the primary focus of the
following discussion, but it will be understood that VP
connections can also be so specified. The data transport unit
for an ATM connection usually is referred to as a "cell." In
this disclosure, however, the term "packet" is sometimes used
to refer to the data transport unit because this more general
terminology is consistent with some of the broader aspects
of the innovations.

The Generic Cell Rate Algorithm (GCRA), which is
specified in ITU-T Recommendation I.371, is well suited
for testing a packet or cell flow for conformance with a
traffic descriptor. To perform such testing, the GCRA
requires the specification of an emission interval (i.e., the
reciprocal of a flow rate) and a tolerance, τ. In practice, this
tolerance may depend on a variety of factors, including the
connection, the connection setup parameters, or the class of
service. As will be seen, the GCRA can be employed as a
Boolean function, where for a flow of fixed size packets or
cells on a connection, the GCRA (emission interval,
tolerance) is false if the flow is conforming to a peak rate or
true if the flow is conforming to a minimum rate. For
example, a source of cells conforms to a PCR if GCRA
(1/PCR, τ_PCR) is false. Likewise, a connection or flow
conforms to an MCR if GCRA (1/MCR, τ_MCR) is false. As
will be appreciated, the "emission interval" is the reciprocal
of the "cell rate."
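
As an illustration of the conformance test, the following is a minimal Python sketch of the virtual scheduling form of the GCRA for fixed size cells. The class name `Gcra`, the method name `conforms`, and the sample arrival times are illustrative assumptions rather than anything defined in this disclosure. For simplicity the sketch returns True for a conforming cell, which is a convention chosen for the example and not the Boolean convention described above.

```python
class Gcra:
    """Virtual scheduling form of GCRA(T, tau) for fixed-size cells.

    T is the emission interval (1/rate) and tau is the tolerance, both in
    the same time unit as the arrival times. All names are illustrative.
    """

    def __init__(self, emission_interval, tolerance):
        self.T = emission_interval
        self.tau = tolerance
        self.tat = None  # theoretical arrival time of the next cell

    def conforms(self, arrival_time):
        """Return True if a cell arriving at arrival_time conforms."""
        if self.tat is None:
            self.tat = arrival_time  # the first cell defines the time origin
        if arrival_time < self.tat - self.tau:
            return False  # too early: the theoretical arrival time is unchanged
        self.tat = max(arrival_time, self.tat) + self.T
        return True


# Hypothetical flow: emission interval of 10 time units, tolerance of 2.
gcra = Gcra(emission_interval=10, tolerance=2)
results = [gcra.conforms(t) for t in (0, 10, 18, 25, 40)]
print(results)
```

In this run the cell arriving at time 18 is early by exactly the tolerance and is accepted, while the cell at time 25 arrives more than the tolerance ahead of its theoretical arrival time and is flagged as non-conforming.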

A DBR traffic contract is appropriate for a source which
establishes a connection in the expectation that a static
amount of bandwidth will be continuously available to the
connection throughout its lifetime. Thus, the bandwidth the
network commits to a DBR connection is characterized by
a PCR value. Further, the cell or packet flow on such a
connection complies with the traffic contract if it conforms
to GCRA (1/PCR, τ_PCR). On the other hand, an SBR traffic
contract is suitable for an application which has known
traffic characteristics that allow for an informed selection of
an SCR and τ_IBT, as well as a PCR and τ_PCR. An SBR or
rt-SBR flow complies with its traffic contract if the flow not
only conforms to GCRA (1/PCR, τ_PCR), but also to GCRA
(1/SCR, τ_IBT).

As previously indicated, an ABR traffic contract is
appropriate for applications that can tolerate the dynamic
variations in the information transfer rate that result from the
use of unreserved bandwidth. A PCR and an MCR are specified
by the source establishing such a connection, and these
parameters may be subject to negotiation with the network.
Thus, the bandwidth that is available on an ABR connection
is the sum of the MCR (which can be 0) and a variable cell
rate that results from a sharing of unreserved bandwidth
among ABR connections via a defined allocation policy (i.e.,
the bandwidth a source receives above its specified MCR
depends not only on the negotiated PCR, but also on
network policy). Feedback from the network enables the
source application to dynamically adjust the rate at which it
feeds cells or packets into an ABR connection. An ABR flow
always complies with its traffic contract if it conforms to
GCRA (1/MCR, τ_MCR), and is always non-compliant if it
does not conform to GCRA (1/PCR, τ_PCR). Conformance in
the region between MCR and PCR is dependent on the ABR
feedback and is thus dynamically determined.

A UBR traffic contract is similar to the ABR contract,
except that the UBR contract does not accommodate the
specification of an MCR and has no dynamic conformance
definition. Therefore, a UBR flow complies with its traffic
contract if it conforms to GCRA (1/PCR, τ_PCR).
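
The compliance rules above differ mainly in how many GCRA tests a flow must pass: one for DBR and UBR, two for SBR. The following Python sketch illustrates that difference under stated assumptions. The function name `conformance`, the parameter values, and the arrival pattern are hypothetical; a cell is treated as conforming only if it passes every test, and the per-test state is advanced only for conforming cells. The feedback-dependent ABR region between MCR and PCR is not modeled.

```python
def conformance(arrivals, tests):
    """Classify each arrival against every GCRA(T, tau) pair in tests.

    arrivals: cell arrival times in increasing order.
    tests: list of (emission_interval, tolerance) pairs.
    Returns one Boolean per cell, True meaning the cell conforms.
    """
    tats = [0.0] * len(tests)  # theoretical arrival time for each test
    verdicts = []
    for t in arrivals:
        ok = all(t >= tat - tau for tat, (_, tau) in zip(tats, tests))
        if ok:
            # Advance every test only when the cell passes all of them.
            tats = [max(t, tat) + T for tat, (T, _) in zip(tats, tests)]
        verdicts.append(ok)
    return verdicts


# Hypothetical contract, in cells per time unit and time units.
PCR, TAU_PCR = 1.0, 0.0
SCR, TAU_IBT = 0.25, 6.0

arrivals = [0, 1, 2, 3, 4, 20]  # a five-cell burst at the peak rate, then a pause
dbr = conformance(arrivals, [(1 / PCR, TAU_PCR)])
sbr = conformance(arrivals, [(1 / PCR, TAU_PCR), (1 / SCR, TAU_IBT)])
print(dbr)
print(sbr)
```

With these values the burst is fully conforming for a DBR or UBR contract at the given PCR, whereas the SBR contract accepts only the first three cells of the burst before the sustainable rate test rejects the remainder.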

B. Traffic Shaping 

ITU-T Recommendation I.371 addresses the possibility
of reshaping traffic at a network element for the purpose of
bringing the traffic into conformance with a traffic descriptor
in the following terms:

"Traffic shaping is a mechanism that alters the traffic 
characteristics of a stream of cells on a VCC or a VPC 
to achieve a desired modification of those traffic 
characteristics, in order to achieve better network effi- 
ciency whilst meeting the QoS objectives or to ensure 
conformance at a subsequent interface. Traffic shaping 
must maintain cell sequence integrity on an ATM 
connection. Shaping modifies traffic characteristics of a 
cell flow with the consequence of increasing the mean 
cell transfer delay. 

Examples of traffic shaping are peak cell rate reduction, 
burst length limiting, reduction of CDV by suitably 
spacing cells in time and queue service schemes. 

It is a network operator's choice to determine whether and
where traffic shaping is performed. As an example, a
network operator may choose to perform traffic shaping
in conjunction with suitable UPC/NPC functions.

It is an operator's option to perform traffic shaping on
separate or aggregate cell flows.

As a consequence, any ATM connection may be subject to 
traffic shaping. 

The options available to the network operator/service 
provider are the following: 

a. No shaping 

Dimension the network in order to accommodate any 
flow of conforming cells at the network ingress 
whilst ensuring conformance at the network egress 
without any shaping function. 

b. Shaping 

Dimension and operate the network so that any flow of 
conforming cells at the ingress is conveyed by the 
network or network segment whilst meeting QoS 
objectives and apply output shaping the traffic in 
order to meet conformance tests at the egress. 

Shape the traffic at the ingress of the network or
network segment and allocate resources according to
the traffic characteristics achieved by shaping, whilst
meeting QoS objectives and subsequent conformance
tests at the network or network segment egress.

Traffic shaping may also be used within the customer
equipment or at the source in order to ensure that the
cells generated by the source or at the UNI are
conforming to the negotiated traffic contract relevant to the
ATC that is used (see Section 5.5)." ITU-T
Recommendation I.371, Section 6.2.5.

C. Scheduling for Real Time and Non-Real Time
Connections/Existing Tools and Techniques

As is known, if bandwidth is not divided "fairly" between
applications employing "best efforts" Internet service or
ABR ATM service, a variety of undesirable phenomena may
occur. See Lefelhocz, Lyles, Shenker and Zhang,
"Congestion Control for Best-Effort Service: Why we need a new
paradigm," IEEE Network, January/February 1996, for
further details on mechanisms for best effort/ABR traffic.

Most ATM switches currently are implemented with FIFO
queuing. FIFO queuing exhibits pathological behaviors
when used for ABR traffic (see "On Traffic Phase Effects in
Packet-Switched Gateways", Sally Floyd and Van Jacobson,
Internetworking: Research and Experience, Vol. 3, pp.
115-156 (1992), and "Observations on the Dynamics of a
Congestion Control Algorithm: The effects of Two-Way
Traffic", Lixia Zhang, Scott Shenker, and David Clark, ACM
Sigcomm 91 Conference, Sep. 3-6, 1991, Zurich,
Switzerland, pp. 133-148.). FIFO also is unable to protect
correctly behaving users against misbehaving users (it does
not provide isolation). As a result of these deficiencies,
non-FIFO queuing mechanisms such as weighted fair queuing
(see, for example, A. Demers, S. Keshav, and S.
Shenker, "Analysis and Simulation of a Fair Queuing
Algorithm," Proceedings of ACM SigComm, pages 1-12,
September 1989; and A. K. Parekh "A Generalized Processor
Sharing Approach to Flow Control in Integrated Service
Networks," Ph.D. Thesis, Department of Electrical
Engineering and Computer Science, MIT, 1992.) or
approximations to fair queuing such as round-robin (Ellen L.
Hahne, "Round-robin Scheduling for Max-Min Fairness in Data
Networks," IEEE Journal on Selected Areas in
Communications, Vol. 9, pp. 1024-1039, September 1991.)
are often suggested.

Service classes which have inelastic bandwidth
requirements often require that data be transmitted through the
network with bounded jitter (i.e., bounded cell or packet
delay variation). As shown by the above referenced Parekh
paper, weighted fair queuing can be used to provide bounded
jitter for real time streams. Moreover, Parekh's results have
recently (Pawan Goyal, Simon S. Lam and Harrick M. Vin,
"Determining End-to-End Delay Bounds in Heterogeneous
Networks," Proceedings of The 5th International Workshop
on Network and Operating System Support for Digital Audio
and Video (NOSSDAV), Durham, N.H., Apr. 18-22, 1995.)
been extended to prove delay bounds for systems using the
closely related mechanisms of Virtual Clock (Lixia Zhang,
"Virtual Clock: A New Traffic Control Algorithm for Packet
Switching Networks," Proceedings of ACM SigComm,
pages 19-29, August 1990.) and Self-clocked Fair Queuing
(S. J. Golestani, "A Self-Clocked Fair Queuing Scheme for
High Speed Applications," Proceedings of INFOCOM, pp.
636-646, 1994).

Thus, it is known that both elastic (Best effort/ABR) and 
inelastic (or real-time) services can benefit from the use of 
fair queuing and related algorithms. 

1. Weighted Fair Queuing and Virtual Clock

Fair queuing and related algorithms (e.g., frame-based
fair queuing, deficit round robin, etc.) operate on sequences
of packets or other data transport units (e.g., an ATM cell is
a packet). For ATM these sequences are identified by either
the VCI or the VPI, while in the Internet protocol suite the
identification is on the basis of <IP address, protocol, port>
triples (IPv4) or flow identifiers (IPv6). In both self-clocked
weighted fair queuing and virtual clock, packets are ordered
(sorted) by timestamps (schemes such as round-robin
provide approximations to ordering of packets by timestamps).
These timestamps represent the virtual finishing time (or
equivalently the virtual starting time) for the packet and are
computed by taking a starting time value and adding an
offset obtained by multiplying the length of the packet by a
weight which represents the particular packet sequence's
share of the bandwidth.

More particularly, for virtual clock the virtual finishing
time is computed as:

VT(f, 0) = 0

VT(f, j+1) = max{Arrival(f, j+1), VT(f, j)} + Length(f, j+1)/Rate(f)   (1)

where: VT(f, j) is the virtual finishing time associated with
packet j of flow (virtual circuit) f;
Arrival(f, j) is the arrival time of packet j of flow f; and
Length(f, j) is the length of packet j of flow f.

Self-clocked weighted fair queuing assigns virtual
finishing times according to the formula:

VT(f, 0) = 0

VT(f, j+1) = max{SystemVirtualTime, VT(f, j)} + Length(f, j+1) * weight(f)   (2)

where: SystemVirtualTime is the virtual time associated
with the packet being served (being output) at the time
packet (f, j+1) arrives.

For ATM the packet length is constant because the cells
are of fixed size (i.e., 53 bytes long). Consequently, the
rightmost term in both Expression (1) and Expression (2)
becomes a per flow constant. For virtual clock the simplified
expression is:

VT(f, j+1) = max{Arrival(f, j+1), VT(f, j)} + constant(f)   (3)

For self-clocked weighted fair queuing, on the other hand,
the simplified expression is:

VT(f, j+1) = max{SystemVirtualTime, VT(f, j)} + constant(f)   (4)

In other words, an ATM queuing point which implements
either virtual clock or self-clocked weighted fair queuing
performs the following steps:

1) compute the maximum of (a) the current virtual time
for the VC, and (b) either of i) the arrival time of the
cell or ii) the system virtual time.

2) add to the results of step 1 above a per-VC constant
representing that VC's share of the bandwidth.

3) service cells (transmit them) in order of increasing
values of the virtual time stamps assigned by steps 1
and 2.

2. Priority

Giving priority to one traffic class over another means that
if the higher priority traffic class has cells ready for
transmission, those cells are always transmitted in
preference to the cells of the lower priority traffic class.

Priority mechanisms can be either preemptive or
non-preemptive. This terminology comes from the operating
system literature. A non-preemptive priority mechanism
assigns a priority to an object (a process in the operating
system world, a VC in the ATM world) at a scheduling time,
and the object then retains this priority until it is served.
Preemptive priority mechanisms, on the other hand, can
change the priority of objects while they are waiting to be
served. For example, in a preemptive system one could say
"schedule this VC with priority 3 but if it is not served within
200 microseconds then increase its priority up to 2."

3. Work Conserving and Non-work Conserving Queuing

Kleinrock, Queuing Systems, Vol. 2: Computer
Applications, John Wiley & Sons, N.Y., N.Y. 1976, p. 113
uses the terminology "work conserving" to denote any
queuing system in which work is neither created nor
destroyed. In keeping with this terminology a switch which,
when given queued cells, always transmits cells on the
outgoing link is a "work conserving switch". Switches
employing a pure FIFO, weighted fair queuing or virtual
clock scheduling algorithm are all work conserving. In
contrast, a non-work conserving switch may choose not to
transmit cells, even when cells are queued for transmission. As
will be seen, a method of doing this is to program the switch
to wait until the current time is equal to or greater than the
timestamp associated with a particular cell before
transmitting that cell.

Work conserving switches attempt to fully utilize the
transmission link, but do not necessarily remove or prevent
bursts. In contrast, non-work conserving switches can
strategically delay cells so as to re-shape traffic to meet a more
stringent conformance test (i.e., a GCRA with a smaller τ).
Additionally, a non-work conserving switch in which a
given connection is only allocated a specified amount of
buffer space can perform a policing function (in ITU terms a
UPC/NPC) by discarding or tagging cells which overflow
the allotted buffer space. An example of a non-work
conserving queuing mechanism is Stalled Virtual Clock (Sugih
Jamin, "Stalled Virtual Clock," [text illegible]
Computer Science, UCLA, Mar. 21, 1994), which is an
adaptation of Lixia Zhang's Virtual Clock algorithm where
virtual time is not allowed to run faster (it stalls or goes
non-work conserving) than real-time. Also see work by
Scott Shenker that is available by FTP at
FTP.PARC.XEROX.com.

4. Calendar Queues

A calendar queue is a time ordered list of actions, each of
which is dequeued and executed when real-time is equal to
or greater than the time associated with the action. Calendar
queues with bounded time intervals can be represented as a
linear array which is known as a "time-wheel" or
"time-line." Time-wheels assign events to buckets relative to a
pointer, where the bucket index is calculated using
arithmetic modulo the wheel size. These data structures are well
known in the literature as a queuing mechanism. In a
time-wheel, absolute time is represented as an offset relative
to the current time ("real time"), and each element in the
array is a bucket which contains one or more actions
(typically in linked-list form) which are to be executed at the
time assigned to the bucket in which they reside. Any of the
buckets of such a time-wheel can be empty, i.e., have no
events associated with it.

For every time-wheel, there are two times of interest:
t_earliest and t_latest, which correspond to the head and tail
pointers for the active entries in the array; where t_earliest is
the time of the next entry (e.g., packet or cell) to be serviced,
and t_latest is the time associated with the latest (most distant
in time) bucket containing a scheduled event. The difference
between t_earliest and t_latest cannot be greater than the length
of the time-wheel, b, minus 1. This can be ensured by
viewing the time as being kept modulo b, and by then
ensuring that no offset (the packet length multiplied by either
the rate or the weight in virtual clock or weighted fair
queuing respectively) is greater than b-1. For an ATM link
running at OC-3 speeds (149.76 Mbps, the SONET payload
rate) there are approximately 353208 cells/sec on the link.
Accordingly, if 64 Kbps (voice telephony rates) flows
(approximately 174 cells/sec when AAL type 1 is used) are
the lowest speed connections that need to be supported, then
the ratio of the highest supported rate to the lowest rate is
2029, which rounds up to 2^11. This ratio is the maximum
offset that will get added during the calculation of virtual
times. Therefore, a time-wheel of length 2030 (2048 to allow
for rounding up to a power of two) is sufficient to encode the
virtual times associated with circuits ranging in rates from
64 Kbps to full OC-3 link rate.
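
The sizing arithmetic and the modulo-indexed bucket array described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `TimeWheel`, its methods `schedule` and `tick`, and the use of integer cell times as the time unit are hypothetical, and the rate figures are simply the ones quoted in the text.

```python
LINK_BPS = 149.76e6           # OC-3 SONET payload rate quoted in the text
CELL_BITS = 53 * 8            # fixed ATM cell size in bits
SLOWEST_CELLS_PER_SEC = 174   # 64 Kbps flow over AAL type 1, per the text

link_cells_per_sec = round(LINK_BPS / CELL_BITS)
max_offset = link_cells_per_sec // SLOWEST_CELLS_PER_SEC
# Smallest power of two that holds offsets 0..max_offset.
wheel_len = 1 << max_offset.bit_length()


class TimeWheel:
    """Calendar queue over a bounded horizon (illustrative names)."""

    def __init__(self, length):
        self.length = length
        self.buckets = [[] for _ in range(length)]
        self.now = 0  # current time in cell times

    def schedule(self, time, action):
        """Place action in the bucket for an absolute time."""
        offset = time - self.now
        if not 0 <= offset < self.length:
            raise ValueError("offset does not fit in the wheel")
        self.buckets[time % self.length].append(action)

    def tick(self):
        """Return the actions due at the current time, then advance."""
        bucket = self.buckets[self.now % self.length]
        due = list(bucket)
        bucket.clear()
        self.now += 1
        return due


wheel = TimeWheel(wheel_len)
wheel.schedule(2, "cell A")
wheel.schedule(max_offset, "cell B")   # the largest offset still fits
first_ticks = [wheel.tick() for _ in range(3)]
print(link_cells_per_sec, max_offset, wheel_len, first_ticks)
```

Rounding the wheel length up to a power of two lets the modulo operation be implemented as a bit mask in hardware, which is presumably why the text allows for 2048 entries rather than 2030.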

The length of a time-wheel array can be decreased by
permitting an array element to contain more than one time 
offset. For example, if the above-described time-wheel is 
reduced to 256 elements from 2048, then each bucket would 
have eight time offsets mapped into it. Actions within a 
single bucket that spans multiple offsets may be performed 
out of order, but between buckets actions will stay in order. 
This reduces the amount of memory that needs to be 
allocated to such a time-wheel at the cost of reducing the 
precision of the ordering of actions in the calendar queue. 
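
The tools surveyed in this section can be combined into a simple non-work conserving shaper: cells are stamped with the simplified virtual clock rule of Expression (3) and a cell is not transmitted until the current time has reached its timestamp. The Python sketch below illustrates the idea under stated assumptions. The function name `stalled_virtual_clock`, the integer cell-time model, the one-cell-per-cell-time output link, and the sample flows are all hypothetical, and a heap stands in for the calendar queue; this is not presented as the Stalled Virtual Clock algorithm cited above.

```python
import heapq


def stalled_virtual_clock(cells, interval):
    """Shape flows by releasing each cell no earlier than its timestamp.

    cells: list of (arrival_time, flow) in arrival order, integer cell times.
    interval: dict mapping flow to its per-flow constant from Expression (3).
    Returns a list of (departure_time, flow) in transmission order.
    """
    vt = {}        # last virtual finishing time assigned to each flow
    pending = []   # heap of (timestamp, sequence number, flow)
    for seq, (arrival, flow) in enumerate(cells):
        stamp = max(arrival, vt.get(flow, 0)) + interval[flow]
        vt[flow] = stamp
        heapq.heappush(pending, (stamp, seq, flow))

    departures = []
    now = -1
    while pending:
        stamp, _, flow = heapq.heappop(pending)
        # One cell per cell time, and never before the cell's timestamp.
        now = max(now + 1, stamp)
        departures.append((now, flow))
    return departures


# Flow "a" sends a three-cell burst; flow "b" sends two closely spaced cells.
cells = [(0, "a"), (0, "a"), (0, "a"), (0, "b"), (1, "b")]
departures = stalled_virtual_clock(cells, {"a": 4, "b": 2})
a_times = [t for t, flow in departures if flow == "a"]
print(departures)
```

Because the link idles rather than sending a cell ahead of its timestamp, the burst on flow "a" leaves spaced by its per-flow constant instead of back to back.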

D. Traffic Shaping for Time Multiplexed Flows on Mul- 
tiple Output Channels 

Preferably, any traffic shaping that is needed to bring time 
multiplexed packet or cell flows into conformance with their 
traffic contracts is performed after the completion of all 
switching or routing operations that are required to separate 
flows for different output channels from each other. This 
permits the throughput efficiency of the multiplexer to be 
optimized. 

However, prior output queued ATM switches generally 
have employed FIFO (First In - First Out) output buffers. 
These buffers are not capable of participating in a controlled 
reshaping of any of the flows that pass through them. 
Instead, the per-VC time multiplexed flows that are output 
by these buffers essentially are time multiplexed composites 
of the input flows that are loaded into them. Of course, these 
output flows are time delayed relative to the input flows 
because of the inherent latency of the buffers. Moreover, the 
cell delay variation (CDV) of one or more of these output 
flows may be increased if scheduling conflicts occur among 
the data transport units of the different flows because these
conflicts cause so-called "transmit collisions." 

As will be appreciated, increased CDV is especially 
troublesome for traffic, such as DBR traffic, which generally 
has a relatively tight tolerance. Thus, if each hop between a 
source and a destination includes a simple FIFO output 
queue of the foregoing type, it may be necessary to limit the 
number of hops this CDV sensitive traffic is permitted to 
make in order to ensure compliance within its specified 
tolerance. 

Accordingly, there is a need for more efficient and more 
effective traffic shaping mechanisms and processes for ATM 
switches and other routers that route traffic from multiple 
inputs to multiple outputs for time multiplexed output emis- 
sion. 

IV. SUMMARY OF THE INVENTION 

This invention provides rate shaping in per-flow output
queued routing mechanisms for statistical bit rate service.

V. BRIEF DESCRIPTION OF THE DRAWINGS

Still further objects and advantages of this invention will
be evident when the following detailed description is read
in conjunction with the attached drawings, in which:

FIG. 1 is a simplified block diagram of an ATM switch in 
which the present invention may be used to advantage; 

FIG. 2 diagrammatically tracks the various shapes an
ATM cell may suitably take while traversing the switch
shown in FIG. 1;

FIG. 3 is a more detailed block diagram of a representa- 
tive channel on the output or transmit side of the chip shown 
in FIG. 1; 

FIG. 4 schematically illustrates a stalled virtual clock
controlled per-VC calendar queue which may be employed
to carry out the present invention; and

FIG. 5 schematically illustrates a multiple priority
alternative to the embodiment shown in FIG. 4.

VI. DETAILED DESCRIPTION OF THE INVENTION

While the invention is described in some detail
hereinbelow with specific reference to certain embodiments, it is
to be understood that there is no intent to limit it to those
embodiments. On the contrary, the intent is to cover all
modifications, alternatives and equivalents falling within the
spirit and scope of the invention as defined by the appended
claims.

A. A Representative Environment

Turning now to the drawings, and at this point especially
to FIG. 1, the input and output ports of an ATM switch 21
typically are coupled to one or more physical layers via
respective Utopia 2 interfaces and to a switch control
processor module 22 via a second suitable interface. This
enables the switch 21 to exchange data and control cells with
any physical layers that are connected to it, and to also
exchange control cells with the control processor module 22.
In keeping with standard practices, the communication
channels are unidirectional, so a pair of channels is
required for bi-directional communications.

The switch 21 comprises a switching fabric 24, a fabric
control module 25 and a reservation ring 26 for switching
data and control cells from input queues to per-VC output
queues. The cells in these queues are stored in the data path
in data memory 27, and these input and output queues are
managed by a queue control module 28. Typically, the data
memory 27 is sized to hold up to roughly 12000 cells.
Connection records for the data and control cell flows are
stored in the control path in control memory 29, together
with certain types of control cells which are intercepted by
a rate based engine and traffic multiplexer 31 for routing to
the control processor module 22. Suitably, the control RAM
29 is capable of accommodating up to about 8200
connection records and 64k cell records. The interaction of the
control processor module 22 with the switch 21 is beyond
the scope of the present invention and, therefore, is not
described herein. However, persons who are familiar with
ATM switch design will understand that the control
processor is primarily responsible for performing connection
establishment and termination, as well as OAM (Operation and
Maintenance) functions.

The data path of the switch 21 is synchronously clocked
(by means not shown) at a predetermined rate of, say, 40
MHz. However, in keeping with conventional synchronous
pipeline design practices, the phase of this clock signal is
delayed (by means also not shown) by differing amounts at
different points along the data path to give the data adequate
time to settle prior to being transferred from one pipelined
stage to the next.

In accordance with standard practices, a source wishing to 
communicate with a destination initiates negotiations with 
the ATM network within which the switch 21 resides by 
sending a SETUP message to the network. This message 
identifies the destination and explicitly or impliedly specifies 
all of the relevant traffic parameters for the requested con- 
nection. If the network is prepared to commit to the traffic 
contract which is defined by these traffic parameters (or a 
modified version of the parameters that the source is willing 
to accept), the network routes the SETUP message to the 
destination. Then, if the destination is ready to receive 
message traffic from the source in accordance with the terms 
of the traffic contract, the destination returns a CONNECT 
message to the source. This CONNECT message confirms 
that a connection has been established on a specified virtual 
circuit (VC) within a specified virtual path (VP) for a cell 
flow that conforms to the traffic contract. See ITU-T
Recommendation Q.2931 and ATM Forum UNI 4.0
Specification. "Permanent" virtual connections can be established by
provisioning, without invoking these signaling protocols.

Data cells begin to flow after a connection is established. 
As shown in FIG. 2, the form of the cells changes as they pass
through the switch 21 because of the operations the switch 
performs. Cells may be replicated within the switch 21 for 
multicasting, but the following discussion will be limited to 
unicast operations to avoid unnecessary complexity. 

As indicated in FIG. 2 at 41, each inbound cell that the 
switch 21 receives has a header containing a VP index and 
a VC index. These indices combine to define a unique 
address for one hop of the connection. A connection may be 
composed of multiple hops, so the VP and VC indices for the 
next hop are written into the header of the cell as it passes
through the switch 21 as indicated in FIG. 2 at 42. 

The switch 21 employs the VP and VC indices of the
inbound cell (FIG. 2 at 41) to compute the address at which
the connection record for the associated flow resides within
the control RAM 29. Typically, this connection record
includes a bit vector for identifying the output port (i.e., the
switch-level "destination") at which the flow exits the switch
21, a priority index for identifying the relative priority of the
flow on a granular priority scale, and a circuit index
("Circuit Index") which uniquely identifies the flow
internally of the switch 21. As shown in FIG. 2 at 43, these
connection parameters are written into the cell header. Then,
the cell is written into the data RAM memory 27, while a
pointer to the cell is linked into an appropriate one of a
plurality of FIFO input queues, where the selection of the
queue is based on the priority of the related flow.
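
The input-side handling just described (record lookup, header rewrite, storage, and priority-based queue selection) can be sketched in Python as follows. Every name here is hypothetical: `ConnectionRecord` and its fields, the dictionary standing in for the computed control RAM address, the list standing in for the data memory, and the choice of four input queues. Storing the next-hop VP and VC indices in the connection record is likewise an assumption made for the example.

```python
from collections import deque, namedtuple

# Per-flow record; field names are illustrative, not taken from the switch.
ConnectionRecord = namedtuple(
    "ConnectionRecord",
    ["output_ports", "priority", "circuit_index", "next_vp", "next_vc"],
)

# A dictionary keyed by (VP, VC) stands in for the computed record address.
control_ram = {(5, 32): ConnectionRecord(0b0100, 1, 7, 9, 44)}
data_ram = []                                 # stored cells
input_queues = [deque() for _ in range(4)]    # one FIFO per priority level


def receive(vp, vc, payload):
    """Look up the flow, rewrite the header, store and enqueue the cell."""
    record = control_ram[(vp, vc)]
    cell = {
        "vp": record.next_vp,             # indices for the next hop
        "vc": record.next_vc,
        "ports": record.output_ports,     # bit vector of output ports
        "priority": record.priority,
        "circuit": record.circuit_index,
        "payload": payload,
    }
    data_ram.append(cell)
    pointer = len(data_ram) - 1
    # Only the pointer is queued; the cell body stays in data memory.
    input_queues[record.priority].append(pointer)
    return pointer


pointer = receive(5, 32, b"payload")
print(data_ram[pointer]["vp"], data_ram[pointer]["vc"], list(input_queues[1]))
```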

The relative priorities of the head of queue cells within 
these input queues are examined during each cell time, and 
the head of queue cell having the highest priority is selected 
for arbitration during the next arbitration session. 
Furthermore, the priority of any lower priority head of queue 
cell (i.e., any non-selected head of queue cell) is incremen- 
tally increased (by means not shown), thereby increasing the 
probability of that cell being selected for arbitration during 
the next arbitration session. Therefore, even though the 
higher priority input queues have greater throughput per unit 
time than the lower priority queues, the lower priority 
queues have bounded delay because the priority of their 
head of queue cells increases as a function of time. 
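
The effect of raising the priority of non-selected head of queue cells can be illustrated with a short Python simulation. The text does not show how the priority is increased, so the details here are assumptions: the function name `simulate`, an increment of one per cell time, a reset to the queue's base priority after a cell is selected, ties going to the first queue listed, and queues that always have a cell waiting.

```python
def simulate(base_priority, cell_times):
    """Select one head-of-queue cell per cell time, aging the others.

    base_priority: dict mapping queue name to its starting priority
    (larger means higher priority).
    Returns the name of the queue selected in each cell time.
    """
    current = dict(base_priority)   # priority of each head-of-queue cell
    winners = []
    for _ in range(cell_times):
        # max() returns the first queue with the highest priority.
        winner = max(current, key=current.get)
        winners.append(winner)
        for queue in current:
            if queue == winner:
                current[queue] = base_priority[queue]  # next cell starts at base
            else:
                current[queue] += 1  # age the non-selected head-of-queue cell
    return winners


winners = simulate({"high": 3, "low": 0}, 10)
print(winners)
```

Under these assumptions the low priority queue is selected once in every five cell times, so it receives less throughput than the high priority queue but its head of queue cell never waits indefinitely.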

Each arbitration cycle requires one cell time of the switch 
21, so the routing information for the cells that are selected 
for arbitration is fed into the reservation ring 26 one cell time 
prior to the release into the switching fabric 24 of the payloads 
of the cell or cells that win the arbitration. In other 
words, as shown in FIG. 2 at 44, the cells that are received 
by the reservation ring 26 and switching fabric 24 are 
composed of the headers of the cells for the next arbitration 
cycle (i.e., the "current cells") followed by the bodies or 
payloads of the cells successfully arbitrated during the 
previous arbitration cycle (i.e., the "previous cells"). Thus, 
when the cell bodies reach the fabric 24, the fabric is already 
configured by the fabric control 25 to route those cells to 
their respective output port destination. For additional infor- 
mation on the reservation ring 26 and the arbitration it 
performs, see Cisneros, A., "Large Packet Switch and Contention 
Resolution Device," Proceedings of the XIII International 
Switching Symposium, 1990, paper 14, Vol. III, pp. 
77-83 and Lyles U.S. Pat. No. 5,519,698, which issued May 
21, 1996. 
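
The one cell time offset between headers and payloads described above can be pictured with a short sketch. It is a simplification under stated assumptions: the function name is hypothetical, each cell is modeled as a (header, payload) pair, and every cell is assumed to win its arbitration, so contention is not modeled.

```python
def interleave_for_fabric(cells):
    """Yield (current_header, previous_payload) pairs, one per cell time.

    Each header is presented one cell time before its own payload, so
    the fabric can be configured before the payload arrives.
    """
    previous_payload = None
    have_previous = False
    for header, payload in cells:
        yield header, (previous_payload if have_previous else None)
        previous_payload, have_previous = payload, True
    if have_previous:
        # Final cell time: no new header, only the last payload.
        yield None, previous_payload


slots = list(interleave_for_fabric([("h1", "p1"), ("h2", "p2")]))
# slots == [("h1", None), ("h2", "p1"), (None, "p2")]
```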

In the illustrated embodiment, cells are decomposed into 
four bit wide nibbles for arbitration and routing. Thereafter 
(with the exception of "idle cells" that may be provided for 
testing the switching processes) the cells are reassembled 
and queued in the data path, the control path, or both (a) for 
time scheduled transfer to the appropriate output ports of the 
switch 21 and/or (b) for transfer to the control processor 
module 22. The time scheduled transfer of cells to the output 
ports of the switch 21 is central to many of the traffic shaping 
techniques of the related filings so that subject is discussed 
in further detail hereinbelow. On the other hand, the decomposition 
and reassembly of the cells, the testing processes, 
and the interaction of the control processor 22 with RM 
(Resource Management) and OAM (Operation and 
Maintenance) cells are incidental topics which need not be 
considered in depth. 

Referring to FIG. 3, it will be understood that the switch 
21 fans out on the output or transmit side of the switching 
fabric 24. Thus, while one output channel of the switch 21 
is shown, it will be understood that this channel is generally 
representative of the other channels. 

As shown, there suitably is a fill cell module 51 for 
accepting cell bodies and their associated circuit indices 
from the switching fabric 24. The "effective cell time" on the 
output side of the switching fabric 24 is determined by the 
ratio of the nominal cell time to the "k" speed-up factor. 
Thus, for example, if the nominal cell time is 113 clock 
cycles/cell, the effective cell time on the output side of the 
switching fabric 24 is 56.5 cycles/cell if k=2. 
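
The ratio can be checked with a few lines of code. The function name is hypothetical; exact fractions are used only so that the half cycle in the example is not subject to rounding.

```python
from fractions import Fraction


def effective_cell_time(nominal_cycles_per_cell, k):
    """Effective cell time = nominal cell time / speed-up factor k."""
    return Fraction(nominal_cycles_per_cell) / Fraction(k)


example = effective_cell_time(113, 2)
# float(example) == 56.5 clock cycles per cell
```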

When a valid cell is received, the fill cell module 51 
typically uses cell structures from a linked and numbered 
free list 52 of such data structures for writing the cell into the 
data memory 27. To that end, the fill cell module 51 suitably 
includes a fetch state machine 53 for fetching cell structures 
from the top of the free list 52 on demand. This enables the 
fill cell module 51 to insert the circuit index for the cell and 
a pointer to the location of the cell in data memory 27 into 
an "arrival" message that it sends to notify a cell flow control 
unit 55 of the cell's arrival. The circuit index enables the 
flow control unit 55 to ascertain the VC or flow to which the 
cell belongs from the connection record in the control 
memory 29. This, in turn, enables the cell flow control unit 
55 to check the traffic shaping status of the flow. An 
OAM/RM recognizer 27 advantageously is provided to 
enable the flow control unit 55 to identify these control cells 
and to determine whether they are to be queued in the data 
path, the control path, or both. 
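
A minimal sketch of the free list handling and the "arrival" message follows. The names are hypothetical, the free list is modeled as a queue of integer pointers into the data memory, and returning nothing when the list is exhausted is an assumption, because the text does not say how that case is handled.

```python
from collections import deque, namedtuple

# Contents of the "arrival" message: circuit index plus memory pointer.
Arrival = namedtuple("Arrival", ["circuit_index", "pointer"])


class FillCellModule:
    """Hypothetical model of the fill cell module and its free list."""

    def __init__(self, num_cell_structures):
        self.free_list = deque(range(num_cell_structures))
        self.data_memory = [None] * num_cell_structures

    def on_valid_cell(self, circuit_index, body):
        if not self.free_list:
            return None   # assumed behavior; not specified in the text
        pointer = self.free_list.popleft()   # fetch from top of free list
        self.data_memory[pointer] = body
        return Arrival(circuit_index, pointer)

    def release(self, pointer):
        # Return a cell structure to the free list after transmission.
        self.data_memory[pointer] = None
        self.free_list.append(pointer)


module = FillCellModule(num_cell_structures=2)
message = module.on_valid_cell(circuit_index=7, body=b"payload")
# message == Arrival(circuit_index=7, pointer=0)
```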

The memory pointers for cells of traffic contract compliant 
flows are queued in per-VC queues in response to 
"addCell" messages which the cell flow control unit 55 
sends to a queue control unit 58. Each addCell message 
identifies the cell to which it pertains and the circuit index 
for the associated flow or VC. Moreover, the addCell 
message also indicates whether the cell is to be queued in the 
data path, the control path, or both. When the cell has been 
appropriately queued, the queue control unit 58 returns an 
"added" message to the cell flow control unit 55 for notifying 
the flow control unit 55 that the newly queued cell 
needs to be taken into account during future rate shaping 
computations on the VC to which it belongs. 

Advantageously, the queue control unit 58 monitors the 
length of the per-VC queues with respect to the depth control 
limits that are set on the respective queues. This enables the 
queue control unit 58 to initiate congestion control action on 
ABR flows when their per-VC queues become excessively 
long. It also permits the queue control unit 58 to identify 
flows that are exceeding their traffic contract so that an 
appropriate policing function (not shown) can be invoked to 
drop or log cells of such non-compliant flows. 

An admission controller 61 monitors the "added" messages 
that are returned by the queue control unit 58 to cause 
a scheduler 62 to schedule the head of queue cells for the 
non-empty per-VC queues on a calendar queue 63 for 
transmission at scheduled times. Suitably, the scheduler 62 
employs a per-VC virtual clock to schedule these head of 
queue cells on the calendar queue 63 in accordance with 
respective virtual finishing times, VT(f, j+1), that the scheduler 
62 computes for them (or, alternatively, "virtual start 
times"). Please see Section III.C.1 above. 

The calendar queue 63 tracks system "real time" or 
"current time" to prevent any of the scheduled cells from 
being released for transmission prior to its scheduled time. 
In other words, the scheduler 62 and the calendar queue 63 
implement a stalled virtual clock so that the cells that are 
scheduled for transmission are released for transmission 
only when system real time has at least reached their 
respective scheduled transmission times. As illustrated, connections 
having cells that have been released for transmission 
by the calendar queue 63 are linked into a link list of 
connections that have cells ready for transmission on a 
transmit list 65. 

The calendar queue 63 notifies the flow control unit 55 
whenever it releases a cell for transmission on any given 
connection. The flow control unit 55, in turn, requests the 
reference to the next cell (i.e., the new head of queue cell), 
if any, on the per-VC queue for the given connection and 
notifies the admission controller 61 that it should admit this 
reference to the scheduler 62 for scheduling. Thus, the 
admission controller 61 effectively engages in closed loop 
communications with the calendar queue 63 to ensure that 
the head of queue cells which it admits for scheduling 
thereon are admitted to the exclusion of all other cells in the 
per-VC queues. Thus, the calendar queue 63 may be implemented 
by employing one or more time bounded time-wheels 
or "time lines," 66. Please see Section III.C.4 above. 
The time span of these time-wheels must be at least as long 
as the period of the lowest frequency flows that the system 
is designed to support to prevent time wrap induced ambiguities 
and preferably is twice as long so relative times can 
be compared using two's complement computations. 

B. Shaping Flows of Fixed Bit Length Data Transport 
Units to Specified Peak Flow Rates 

Referring to FIG. 4, it will be evident that stalled virtual 
clock transmission control is well suited for shaping time 
multiplexed flows of fixed bit length data transport units, 
such as ATM cells, from an output queued routing mechanism 
to specified peak data unit flow rates, such as PCRs for 
DBR/CBR ATM service. As previously described, the data 
transport units of the flows that are routed to a given output 
port are queued, after being routed, in per-flow queues. The 
data transport units at the heads of these queues then are 
admitted by an admission controller 61 (to the exclusion of 
all other transport units) for scheduling on a time line 
calendar queue 63 by a scheduler 62. The scheduler 62, in 
turn, performs per-flow virtual clock computations on these 
head of queue transport units to schedule them for release 
from the calendar queue 63 in accordance with their respective 
theoretical finish times, VT(f, j+1), or their respective 
theoretical start times. Please see Section III.C.4 above. 

Real time advantageously is incremented on the time line 
66 at a rate that enables the shaped, time multiplexed output 
traffic to essentially fill the bandwidth of the output link. As 
will be recalled, the maximum number of resolvable time 
slots into which the scheduler 62 can map the head of queue 
members of the respective flows is based on the ratio of the 
maximum permissible frequency to the minimum permissible 
frequency of those flows. Thus, the rate at which real 
time is incremented from bucket-to-bucket is a rational 
multiple of the cell rate. 

Data transport units residing in time slots which represent 
times that are earlier than or equal to the current real time of 
reference for the time line 66 are eligible for transmission 
and, therefore, are linked into a transport list 65 as previously 
described. However, those data transport units which 
reside in time slots that are associated with later time slots 
of the time line 66 remain in a pending state until system real 
time advances sufficiently to reach those time slots. To avoid 
rollover ambiguities, the time line 66 is designed to ensure 
that all references to earlier scheduled data transport units 
are removed from each time slot before any references to 
later scheduled transport units are inserted therein in anticipation 
of the next scan. 

While the above-described arrangement effectively 
shapes conforming DBR/CBR ATM flows to the PCRs 
specified by their traffic contracts, it does not aid in bringing 
the Cell Delay Variations (CDVs) of those flows into conformity 
with the τ_PCR parameters of their traffic contracts. 

C. Multiple Priority Levels for Minimizing Relative CDV 

In accordance with the present invention, data transport 
units that are delivered to a multiplexing point, such as an 
output port of an ATM switch, by flows having different 
frequencies are prioritized so that the data transport units of 
the higher frequency flows are given transmit priority over 
any data transport units of lower frequency flows with which 
they happen to collide. As shown in FIG. 3, this transmit 
priority can be implemented by steering the data transport 
units that are admitted for scheduling by a stalled virtual 
clock scheduling mechanism 63 or the like to one or another 
of a plurality of priority rank ordered time-lines 66a-66e or 
output FIFO queues based on the frequencies of the flows to 
which those respective data transport units belong. For 
example, for an ATM switch, it may be advisable to implement 
on the order of five different frequency dependent/class 
of service dependent output priorities, including (1) a top 
priority for cells from flows that have negotiated output rates 
of at least 1/16 of the full rate of the output link (i.e., its 
aggregate bandwidth), (2) a second priority for cells from 
flows having negotiated output rates ranging from 1/16 to 1/256 
of the output link rate, and (3) a third priority for cells from 
flows having negotiated output rates ranging from 1/256 to 
1/4096 of the link rate. The lower two priorities then suitably 
are established for ABR connections that have non-zero 
negotiated MCR rates and for UBR connections and ABR 
connections that have MCR rates of 0, respectively. 

As will be appreciated the present invention effectively 
reduces the CDVs of the higher frequency flows without 
materially increasing the CDVs of the lower frequency 
flows. As a general rule, the CDV that is tolerable is a 
function of the negotiated rate for a flow. For example, a 
CDV of 100 cell times is very large with respect to an 
expected emission interval of one cell every 10 cells, but 
generally insignificant if the negotiated emission interval is 
only one cell every 2029 cells. 

When a calendar queue mechanism is employed to schedule 
the data transport units or cells of the different frequency 
flows for transmission, the high frequency high priority 
flows need to be resolved to the precision of a single cell 
time while being scheduled to achieve an acceptably low 
CDV, but the low frequency/low priority flows can be more 
coarsely resolved to a precision of, say, 16 cell times. This 
means that the number of time slots on the calendar queue 
63 can be reduced. This enables the amount of memory that 
is required to implement the calendar queue 63 to be reduced 
at the cost of losing some typically unneeded precision in the 
scheduling of the head of queue cells of the lower frequency 
flows. 

It is to be understood that the frequency based prioritization 
technique which the present invention provides for 
resolving transmission conflicts at multiplex points among 
flows of different nominally fixed frequencies may be 
employed in many different applications for reducing the 
relative jitter of the flows, including in applications having 
work conserving per-flow output queues for feeding cells or 
other data transport units into such a multiplex point. 

D. Traffic Shaping to SCR/IBT and PCR Parameters for 
Real Time and Non-Real Time SBR/VBR Service 

As previously pointed out, an SBR flow (equivalent to a 
VBR flow) conforms with its traffic contract if it not only 
conforms to its negotiated GCRA (1/PCR, τ_PCR) but also to 
its negotiated GCRA (1/SCR, τ_IBT). In accordance with this 
invention, it has been recognized that a stalled virtual clock 
calendar queue 63 (FIG. 3) can be employed to bring one or 
more flows into conformance with such a traffic contract at 
the egress of a network element, such as an output port of the 
ATM switch 21 (FIG. 1). To that end, the flow of cells (i.e., 
data transport units) from the output of the calendar queue 
are shaped to the negotiated PCR parameter for the flow 
unless and until GCRA (1/SCR, τ_IBT) is true, at which point 
the output flow is shaped to its negotiated SCR parameter. 

As shown in FIG. 4, one way to accomplish this is to 
enqueue admitting successive cells of a flow, f, into the 
associated per-VC queue for scheduling by the scheduler 62 
at an initial rate determined by the negotiated PCR parameter 
for that flow. If it is determined at 75 that the cells of the 
flow f are being output at a rate conforming to the GCRA 
(1/SCR, τ_IBT) for that flow, the scheduler 62 is controlled to 
continue to schedule subsequent cells of the flow, f, on the 
calendar queue 63 in accordance with its PCR parameter. On 
the other hand, if it is determined at 75 that the cells of the 
flow f are being output from the calendar queue 63 at a rate 
that does not conform to the GCRA (1/SCR, τ_IBT) for the 
flow, the scheduler 62 then is controlled to schedule these 
subsequent cells on the calendar queue 63 in accordance 
with the negotiated SCR parameter for the flow, f. As will be 
appreciated, the incremental amount by which the virtual 
clock for each successive cell of the flow f is incremented in 
the connection record for the flow f is easy to switch back 
and forth as required between the PCR emission interval 
T=1/PCR and the SCR emission interval T=1/SCR. 

Alternatively, as shown in FIG. 5, the admission controller 
may admit cells from the flow f for scheduling by the 
scheduler 62 on a relatively low priority stalled virtual clock 
calendar queue 81 or on a relatively high priority stalled 
virtual clock calendar queue 82. If it is determined at 75 that 
the aggregate rate at which the cells of this flow f are being 
output from these calendar queues 81 and 82 conforms to 
GCRA (1/SCR, τ_IBT), incoming cells of the flow f then are 
scheduled on the low priority calendar queue 81 by the 
scheduler 62 in accordance with the negotiated PCR for the 
flow. On the other hand, if it is determined at 75 that the rate 
at which the cells of the flow f are being output does not 
conform to GCRA (1/SCR, τ_IBT), incoming cells of that flow 
then are scheduled on the high priority calendar queue 82 by 
the scheduler 62 in accordance with the negotiated SCR 
parameter. As shown, steering logic 83 steers the output of 
the scheduler 62 to the high priority or low priority calendar 
queue 81 or 82 depending on whether the determination at 
75 returns a true or false state. 

Any transmission collisions that occur are resolved in 
favor of outputting the SCR scheduled cells from the high 
priority queue 82 prior to the PCR scheduled cells from the 
low priority queue 81. This enables earlier scheduled cells to 
be output in advance of subsequently scheduled cells, 
thereby preserving cell order. 

What is claimed: 

1. In a switch for a packet switched communication 
system, an apparatus for serially emitting packets of multiple 
time multiplexed flows in substantial compliance with 
network traffic contracts corresponding to the respective 
flows, at least one of said traffic contracts specifying a 
sustainable packet emission rate and a peak packet emission 
rate, said apparatus comprising: 

a queuing mechanism, said queuing mechanism organizing 
pending packets in respective queues in accordance 
with an oldest pending packet at head of queue order; 
and 

a scheduling mechanism coupled to said queuing mechanism 
including a non-work conserving calendar queue, 
said scheduling mechanism scheduling pending packets 
of respective flows on said non-work conserving 
calendar queue for emission (i) at said peak packet 
emission rate if the packets are being output at a rate 
less than said sustainable packet emission rate and (ii) 
at said sustainable packet emission rate if the packets 
are being output at a rate substantially equal to or 
greater than said sustainable packet emission rate. 

2. The apparatus of claim 1, wherein said traffic contracts 
further specify a numerically quantifiable burst tolerance 
associated with said sustainable packet emission rate and 
said queues have depths selected to enforce said burst 
tolerance. 

3. The apparatus of claim 1, wherein said non-working 
conserving calendar queue is a stalled virtual clock calendar 
queue. 

4. The apparatus of any of claims 1-3, wherein said 
packets are of uniform fixed bit length. 

5. The apparatus of claim 4, wherein said packets are fixed 
byte length cells for asynchronous transfer mode communications. 

6. A switch for a packet switched communication system 
including a traffic shaper for serially emitting packets of 
multiple time multiplexed flows in substantial compliance 
with individual network traffic contracts for the respective 
flows, at least one of said traffic contracts specifying a 
sustainable packet emission rate and a peak packet emission 
rate, said switch comprising: 

a queuing mechanism, said queuing mechanism organizing 
pending packets in respective queues in accordance 
with an oldest pending packet at a head of a queue 
order; and 

a scheduling mechanism coupled to said queuing mechanism 
including a relatively high priority non-work 
conserving calendar queue and a relatively low priority 
calendar queue, said scheduling mechanism scheduling 
pending packets of respective flow (i) on said low 
priority non-work conserving calendar queue at said 
peak packet emission rate if the packets are being 
output at a rate less than said sustainable packet emission 
rate and (ii) on said high priority non-work conserving 
calendar queue at said sustainable packet emission 
rate if the packets are being output at a rate 
substantially equal to or greater than said sustainable 
packet emission rate. 
7. The switch of claim 6, wherein said non-work conserving 
calendar queues are stalled virtual clock calendar 
queues. 

8. The switch of claim 7, wherein said traffic contracts 
further specify a burst tolerance associated with said sus- 
tainable packet emission rate and said queues have depths 
selected to enforce said burst tolerance. 

9. The switch of claim 8, wherein said packets are of 
uniform, predetermined bit length. 

10. The switch of claim 9, wherein said packets are fixed 
byte length cells for asynchronous transfer mode commu- 
nications. 

11. The switch of claim 7, wherein said packets are 
scheduled on said non-work conserving calendar queues for 
emission at virtual times. 

12. The switch of claim 11, wherein emission intervals 
between packets scheduled for emission on said non-work 
conserving calendar queues are real-time offsets. 






